19. ABSTRACT (Continue on reverse if necessary and identify by block number)

Pictures  help  people  to  comprehend  and  remember  texts.  The  goal  of  this  project  is  to  begin  to  understand  how 
this occurs. This Final Technical Report describes progress in three areas. First, we have demonstrated that
pictures  are  used  to  modify  the  mental  representation  derived  from  texts.  When  reading  with  pictures,  people 
tend  to  form  mental  models,  even  when  reading  in  relatively  unfamiliar  domains.  These  mental  models  are 
representations  of  what  the  text  is  about  (in  contrast  to  representations  of  the  text  itself),  they  have  an  analogical 
character,  and  they  are  constructed  using  the  visual/spatial  sketchpad  of  working  memory.  Second,  we  have 
documented  some  comprehension  processes  that  are  affected  by  pictures  and  some  that  are  not.  In  particular, 
ease of anaphor resolution is independent of the presence or absence of pictures. On the other hand, pictures
enhance  the  reader’s  ability  to  compute  a  particular  kind  of  elaborative  inference  that  we  call  noticing.  These 
inferences  are  derived  from  spatial  relations  within  the  mental  model,  but  need  not  represent  spatial  information. 
Third,  we  describe  a  computer  simulation  that  demonstrates  how  the  various  processes  and  representations 
identified  experimentally  can  be  coordinated  using  a  limited-capacity  system. 
A. List of Objectives

Leafing through textbooks, manuals, and newspapers will quickly demonstrate that pictorial information is
common  in  non-narrative  texts.  The  belief  shared  by  authors  and  readers  is  that  the  pictorial  information  assists 
comprehension  and  memory  for  the  information.  In  fact,  the  vast  majority  of  the  experimental  literature 
investigating  the  effects  of  pictures  on  comprehension  corroborates  this  belief.  Nonetheless,  there  is  little  in  that 
literature  to  suggest  how  it  is  that  pictures  produce  this  benefit.  The  intent  of  the  research  described  in  this  report 
is to discover the cognitive mechanisms underlying the beneficial effects of pictures in text. The strategy will be to
work  on  two  closely  related  objectives.  Objective  1:  Identify  how  pictures  (diagrams)  modify  the  representation 
of  information  gained  from  text.  Objective  2:  Identify  how  pictures  (diagrams)  modify  the  processing  of  textual 
information. 


Investigation  of  these  objectives  will  proceed  by  contrasting  two  theoretical  views.  The  first  is  Paivio's 
(1986)  dual-coding  theory  which  proposes  that  pictorial  and  verbal  information  are  represented  as  separate  codes. 
According  to  this  view,  pictures  help  comprehension  by  ensuring  the  existence  of  an  additional  code  which  may 
be  used  to  assist  remembering.  The  second  view  is  derived  from  the  mental  model  interpretation  of  text 
comprehension.  According  to  this  view,  the  goal  of  comprehension  processes  is  to  represent  the  objects  and 
events described by the text, rather than the text itself (Garnham, 1987; Glenberg, Meyer, &
Lindem,  1987).  This  view  proposes  that  the  mental  model  integrates  pictorial  and  verbal  information  in  a  single 
code  (although  separate  pictorial  and  verbal  codes  may  also  be  generated).  Pictures  help  comprehension  by 
assisting construction of a mental model (that is, an integrated representation of the referent situation described by
the text).

We have made substantial progress on identifying how pictures affect the representation
gleaned  from  texts.  In  overview,  a  primary  effect  is  that  pictures  help  people  to  build  mental  models  of  the  text. 

A secondary effect is that pictures also seem to provide a long-term imaginal code that can benefit performance on
comprehension  tests. 


As  a  cognitive  representation  of  a  text,  a  mental  model  is  a  representation  of  what  a  text  is  about,  not  a 
representation  of  the  linguistic  entities  and  structure  of  the  text  itself.  Although  there  are  several  representational 
formats that are consistent with this definition (e.g., Johnson-Laird, 1983; Van Dijk and Kintsch, 1983; Just and
Carpenter,  1991),  we  believe  that  such  models  are  often  based  on  dimensions  of  experience.  In  particular,  it 
appears  that  mental  models  are  often  spatial,  making  use  of  our  abilities  to  represent  spatial  relations  and 
manipulate  cognitive  entities  within  such  a  representation.  Thus,  when  reading  a  description  of  the  Taj  Mahal  (or 
of a non-existent object such as the colossus of Ozymandias) we are able to develop a representation of what the
object  looks  like  (including  its  spatial  extent),  not  just  a  representation  of  the  words  and  sentences  used  to 
describe  the  object.  Spatial  representations  may  also  be  used  to  represent  non-spatial  information.  Thus,  the 
order of steps in a procedure may be represented as a spatially arrayed flowchart, or the energy levels of subatomic
particles may be represented (cognitively) as locations along a horizontal dimension representing amount of
energy. 

In  what  sense  do  mental  models  capture  the  idea  of  comprehension?  One  function  of  language  is  to  inform 
us  about  objects  and  events  that  are  not  perceptually  available.  If  language  is  to  be  of  use  in  this  regard,  it  must 
direct  us  to  represent  the  new  information  in  a  useful  format.  That  is,  the  format  should  facilitate  recognition  and 
manipulation  of  the  objects.  Because  most  objects  have  a  spatial  extent,  comprehension  of  text  virtually  requires  a 
spatial, model-like representation if the text is to be useful. Glenberg, Kruley, and Langston (in press) give
details  regarding  such  a  representation. 


How do pictures help comprehension? Data reported in Glenberg and Langston (1992) are the primary
evidence  for  our  claim  that  pictures  help  people  to  build  mental  models.  In  those  experiments,  students  read  brief 
descriptions  of  four-step  procedures  (e.g.,  how  to  write  a  term  paper).  In  the  critical  texts,  the  text  explicitly 
stated that the first step was followed by steps two and three, which were performed simultaneously, and then the
fourth  step.  Although  steps  2  and  3  are  performed  simultaneously,  by  necessity,  they  are  described  one  after  the 
other.  Half  of  the  subjects  read  the  texts  alone,  and  half  read  the  texts  accompanied  by  a  flowchart  that  illustrated 
the  order  of  the  four  steps.  Following  the  reading  of  each  text,  we  assessed  the  strength  of  the  relationship 
between  steps  that  were  described  near  each  other  in  the  text  (e.g.,  steps  1  and  2  or  steps  3  and  4)  or  steps  that 
were described far from each other in the text (e.g., steps 1 and 3 or steps 2 and 4). Note, however, that for all of
these  pairs,  the  steps  within  a  pair  occur  immediately  after  one  another  when  the  procedure  is  executed.  Thus,  if 
subjects  are  representing  the  order  of  the  steps  in  the  procedure  (a  mental  model  of  the  procedure)  the  relation 
between  steps  in  a  far  pair  should  be  just  as  strong  as  the  relation  between  steps  in  a  near  pair;  in  the  procedure, 
steps in these pairs are adjacent. However, if subjects are representing the order of the steps in the text, the
relation  between  steps  in  a  far  pair  should  be  weaker  than  the  relation  between  steps  in  a  near  pair. 
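
The logic of the near/far comparison can be sketched in a few lines of Python. This is only an illustration of the two competing predictions, not the analysis used in the experiments. The names TEXT_POSITION, PROCEDURE_STAGE, text_distance, and procedure_distance are hypothetical, and "distance" is assumed to be a simple difference in ordinal position.

```python
# Hypothetical sketch: steps 2 and 3 are executed simultaneously,
# but the text has to describe them one after the other.
TEXT_POSITION = {1: 0, 2: 1, 3: 2, 4: 3}     # order of mention in the text
PROCEDURE_STAGE = {1: 0, 2: 1, 3: 1, 4: 2}   # order of execution in the procedure

NEAR_PAIRS = [(1, 2), (3, 4)]   # described near each other in the text
FAR_PAIRS = [(1, 3), (2, 4)]    # described far from each other in the text


def text_distance(a, b):
    """Separation of two steps in a representation of the text itself."""
    return abs(TEXT_POSITION[a] - TEXT_POSITION[b])


def procedure_distance(a, b):
    """Separation of two steps in a model of the procedure."""
    return abs(PROCEDURE_STAGE[a] - PROCEDURE_STAGE[b])


for label, pairs in (("near", NEAR_PAIRS), ("far", FAR_PAIRS)):
    for a, b in pairs:
        print(label, (a, b),
              "text:", text_distance(a, b),
              "procedure:", procedure_distance(a, b))
```

Under these assumptions, near and far pairs differ in text distance but not in procedure distance, which is the contrast the experiments exploit.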

The  data  were  very  clear.  When  the  texts  were  accompanied  by  a  picture,  the  subjects  formed  a  mental 
model  (representation  of  the  order  of  the  steps  in  the  procedure),  whereas  when  the  texts  were  read  alone,  the 
subjects  formed  a  representation  of  the  text.  Various  control  conditions  allowed  us  to  eliminate  alternative 
explanations based on dual-code theory, motivational factors associated with pictures, and selective repetition of
information  presented  in  pictures.  In  addition,  we  were  able  to  demonstrate  that  when  the  pictures  depicted  the 
order  in  which  the  steps  were  described  in  the  text  (thus  the  pictures  reinforced  the  text,  but  not  the  procedure 
itself), comprehension suffered compared to having no pictures at all.

These results are significant for several reasons. First, they are among the first to demonstrate how it is that
pictures facilitate comprehension. At the risk of redundancy: Pictures help readers to construct mental models.
Second, they are the first to demonstrate construction of a spatial mental model for texts that describe non-spatial
domains  (e.g., the  temporal  order  of  steps  in  a  procedure). 

The  mental  models  account  has  not  gone  unchallenged.  McKoon  and  Ratcliff  (1992)  have  offered  a  new 
account  of  data  that  had  heretofore  been  regarded  as  among  the  strongest  demonstrating  mental  models.  I  will 
describe  that  original  data  (Glenberg,  Meyer,  and  Lindem,  1987)  first,  then  the  McKoon  and  Ratcliff  alternative, 
and finally, our response (Glenberg and Mathew, 1992). In the Glenberg et al. (1987) experiments, subjects read
brief descriptions that included one of two versions of a critical sentence. In the associated version, the critical
sentence associated a main actor (e.g., John) and a target object (e.g., sweatshirt). An example is, "After warming
up, John put on his sweatshirt and jogged halfway around the lake." In the dissociated version, the critical
sentence dissociated the main actor and the target object, as in "After warming up, John took off his sweatshirt
and  jogged  halfway  around  the  lake."  Following  the  critical  sentence,  the  main  actor  was  kept  foregrounded  by 
pronominal  reference,  whereas  the  target  was  never  again  mentioned  or  referred  to.  Availability  of  the  target 
object was tracked by how quickly subjects could recognize the target object (sweatshirt) when presentation of the
word naming the target interrupted reading. If subjects were representing the text, then responding to the
target  should  be  equally  fast  in  the  associated  and  dissociated  conditions:  in  both  conditions  the  target  object  is 
mentioned  once  in  relation  to  the  main  actor.  If  subjects  were  representing  a  mental  model  of  what  the  text  is 
about,  the  prediction  is  quite  different.  In  the  mental  model  formed  in  the  associated  condition,  the  main  actor  and 
the  target  object  are  spatially  related.  Hence  when  the  main  actor  is  foregrounded,  it  is  likely  that  the  target  is  too. 
In the mental model formed in the dissociated condition, the main actor and the target object are spatially
dissociated. Hence when the main actor is foregrounded, it is less likely that the target object is foregrounded.
Thus, the prediction based on the mental model account is faster responding to the target in the associated
condition than in the dissociated condition. This is just what Glenberg et al. (1987) found.

McKoon and Ratcliff (1992) suggested an alternative account of these data based on the notion of salience.
The  idea  is  that  by  virtue  of  association  with  the  main  character,  salience  is  conferred  on  the  target  item  (and  the 
propositions  derived  from  the  sentence).  According  to  this  alternative,  the  faster  responding  in  the  associated 
condition does not reflect construction of a spatial mental model; instead, it reflects salience or enhanced
accessibility  conferred  by  association.  To  test  this  idea,  McKoon  and  Ratcliff  added  to  the  texts  a  location  for  the 
critical target item. Thus, John might be described as putting on a sweatshirt from the laundry (the location).
Because the location is part of the associated (or dissociated) context, salience is also conferred on the location,
and  hence  it  should  be  responded  to  more  quickly  in  the  associated  condition  relative  to  the  dissociated  condition. 
The  prediction  from  the  mental  model  account  is  different.  Because  (according  to  McKoon  and  Ratcliff)  the 
location is always spatially distant from the main actor, responding to the location should not vary as a function of
the associated/dissociated status of the target object (sweatshirt). In fact, McKoon and Ratcliff found that both the
target object and its initial location were affected by the associated/dissociated variable, as predicted from the
salience account.

Glenberg  and  Mathew  (1992)  discussed  several  logical  problems  with  the  salience  account.  They  also 
reported  two  experimental  tests  of  the  account.  First,  they  asked  subjects  to  rate  the  salience  (defined  as 
importance in the text) of both the critical object and the initial location in the associated and dissociated versions
of the texts. The salience account predicts higher ratings of both the object and the location in the associated
condition compared to the dissociated condition. For the object, rated salience increased somewhat (but not by a
statistically significant amount) from the dissociated to the associated condition. For the location, rated salience
decreased  substantially  from  the  dissociated  to  the  associated  condition,  just  the  opposite  of  the  prediction  from  the 
salience  account. 

If salience does not control responding in this task, what produced McKoon and Ratcliff's original finding?
In preparing the texts for the rated salience experiment, we discovered that about 25% of McKoon and Ratcliff's
texts  did  not  clearly  meet  their  stated  condition  that  the  initial  location  and  the  main  actor  were  always  spatially 
dissociated.  For  example,  in  one  text,  the  main  actor  is  a  fisherman,  the  object  is  a  bag  of  chips  that  blows  into 
the  boat  (associated)  or  into  the  water  (dissociated)  and  the  initial  location  of  the  chips  is  the  fisherman's  hand. 
Clearly,  the  hand  is  not  dissociated  from  the  fisherman.  When  the  contribution  of  these  texts  is  removed  from 
McKoon and Ratcliff's data, the effect of the associated/dissociated variable greatly decreases for the initial
location  and  is  slightly  enhanced  for  the  object  (Glenberg  &  Mathew,  1992).  Furthermore,  Glenberg  and  Mathew 
replicated  the  McKoon  and  Ratcliff  experiment  using  carefully  constructed  texts,  and  they  found  no  effect  of  the 
associated/dissociated variable on the locations, but a substantial effect on the objects. In conclusion, support for
the salience account appears to derive from an experimental artifact. When that artifact is removed, the salience
account  receives  no  support,  and  the  mental  model  account  continues  to  be  supported. 

Overall,  the  data  are  quite  convincing  that  people  construct  spatial  mental  models  while  comprehending  (in 
many  circumstances),  and  that  pictures  facilitate  the  construction  of  such  models.  However,  data  from  Glenberg 
and  Kruley  (1992)  demonstrate  that  pictures  can  have  additional  facilitative  effects.  Those  experiments  (described 
in  more  detail  below)  were  designed  to  uncover  some  of  the  ways  in  which  pictures  affect  on-line  processing  of 
texts.  In  one  experiment,  subjects  read  texts  alone,  texts  that  were  accompanied  by  a  picture,  and  texts  that  were 
followed  by  a  picture.  Performance  on  a  comprehension  task  was  best  when  pictures  accompanied  the  text. 
However,  when  pictures  followed  the  texts  (and  thus  were  unlikely  to  have  much  of  an  influence  on  the 
representation  of  the  information  derived  from  the  text)  there  was  nonetheless  a  significant  improvement  in 
comprehension performance. These data point to an additional role for pictures, one that is consistent with the
dual  code  theory.  We  are  in  the  process  of  tracking  down  this  additional  role,  but  we  have  as  yet  little  to  report. 

We  have  also  made  progress  toward  the  second  objective,  identifying  how  pictures  modify  the  processing  of 
textual  information.  Our  first  efforts  were  directed  at  demonstrating  that  pictures  may  facilitate  an  important 
component process of comprehension, finding the antecedents of anaphors (Glenberg and Kruley, 1992).

Subjects  read  texts  requiring  resolution  of  anaphors  for  which  the  antecedents  were  presented  shortly  before  or 
well  before  presentation  of  the  anaphor.  We  know  from  other  research  that  increasing  the  time  between  the 
antecedent  and  the  anaphor  slows  down  and  complicates  anaphor  resolution.  An  orthogonal  manipulation  was 
that  the  texts  were  accompanied  by  pictures  or  not.  We  thought  that  pictures  might  facilitate  anaphor  resolution 
because the anaphors that we used were terms describing the spatial locations of parts of objects (e.g., "the part
on  the  top").  Thus,  when  anaphor  resolution  was  difficult  (when  the  antecedent  was  presented  well  before  the 
anaphor), we expected the picture to have a substantial benefit. Rather than having to search through a mnemonic
representation of the text for an appropriate antecedent, the reader could simply refer to the mentioned spatial
location  in  the  picture.  However,  when  anaphor  resolution  was  easy  (when  the  antecedent  was  presented  shortly 
before  the  anaphor)  we  expected  little  benefit  of  the  picture.  In  this  case,  the  required  antecedent  should  be  highly 
available in memory. The results across several experiments were very consistent. Antecedent distance affected
processing and pictures affected processing, but contrary to our expectations, the two did not interact. Thus,
pictures  do  not  seem  to  enhance  difficult  anaphor  resolution. 

We believe that pictures help people to derive mental models (Glenberg & Langston, 1992), and we believe
that these mental models are constructed using the visual/spatial sketchpad (Baddeley, 1992) of working memory
(the details of this proposal are discussed in reference to Langston and Glenberg, in preparation, and Glenberg,
Kruley,  and  Langston,  in  press,  both  of  which  are  described  later).  Glenberg,  Kruley,  and  Sciama  (in 
preparation)  pinpointed  the  use  of  the  visual/spatial  sketchpad  in  comprehension  of  text  accompanied  by  pictures. 
Those  experiments  used  the  dual-task  methodology  championed  by  Baddeley  to  uncover  those  aspects  of  working 
memory  utilized  by  a  particular  task.  Suppose  that  comprehension  of  texts  with  pictures  does  require  the 
sketchpad. In that case, concurrently performed tasks which also require the sketchpad will be performed less
well. To test this prediction, we developed a task that we believe taps the sketchpad, a short-term dot memory
task. Subjects in the experimental version of the task were briefly shown an array of five dots in a grid (usually
7 × 7). At a later time, a test array was shown, and subjects determined if the test array matched the study array.

We  also  developed  a  control  condition  that  required  similar  perceptual  and  response  processes,  but  had  no 
memory  requirement.  In  the  control  condition,  the  study  array  was  an  empty  grid.  When  the  test  array  was 
shown,  the  subject  responded  as  to  whether  a  majority  of  the  dots  were  above  the  center  line  of  the  grid. 
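
The two versions of the dot task can be made concrete with a short Python sketch. It is a minimal illustration under stated assumptions, not the software used in the experiments. All function names are hypothetical. The way a non-matching test array is built (moving one dot to an empty cell) and the treatment of dots lying on the center row (counted as not above the line) are assumptions that the report does not specify.

```python
import random

GRID_SIZE = 7   # the report says the grid was usually 7 x 7
N_DOTS = 5      # five dots per array


def all_cells(size=GRID_SIZE):
    """Every (row, col) cell of the grid; row 0 is the top row."""
    return [(r, c) for r in range(size) for c in range(size)]


def make_array(rng, size=GRID_SIZE, n_dots=N_DOTS):
    """Place n_dots dots in distinct, randomly chosen cells."""
    return frozenset(rng.sample(all_cells(size), n_dots))


def make_mismatch(rng, study, size=GRID_SIZE):
    """Assumed foil construction: move one dot to a previously empty cell."""
    moved = rng.choice(sorted(study))
    empty = [cell for cell in all_cells(size) if cell not in study]
    return frozenset((set(study) - {moved}) | {rng.choice(empty)})


def memory_response(study, test):
    """Experimental version: does the test array match the study array?"""
    return study == test


def control_response(test, size=GRID_SIZE):
    """Control version: are a majority of the dots above the center line?"""
    center_row = size // 2
    above = sum(1 for row, _ in test if row < center_row)
    return above > len(test) / 2


rng = random.Random(0)
study = make_array(rng)
foil = make_mismatch(rng, study)
print(memory_response(study, study), memory_response(study, foil))
```

The point of the sketch is that both versions present the same kind of display and require the same kind of yes/no response, but only memory_response needs the study array to be held over the retention interval.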

In  Experiments  1  and  2,  the  subjects  comprehended  texts  presented  with  or  without  pictures.  Each  text  was 
divided  into  eight  segments,  each  segment  consisting  of  one  or  more  sentences.  Each  text  segment  was  preceded 
by  a  study  array  and  followed  by  a  test  array.  We  expected  text  comprehension  to  be  facilitated  by  the  pictures 
(and  it  was),  and  we  expected  comprehension  to  suffer  when  the  experimental  version  of  the  dot  task  was  used 
(and it did). More important was whether or not performance on the experimental version of the concurrent task
was selectively disrupted when subjects comprehended texts presented with pictures. In fact, we found this
selective disruption. Although small numerically (about a five percent reduction), it was easily statistically
significant  in  two  experiments. 

These  results  are  suggestive  of  the  use  of  the  sketchpad  for  comprehension  of  texts  with  pictures.  However, 
an  alternative  explanation  for  the  findings  is  that  presentation  of  pictures  (which  occurred  in  the  interval  between 
the  study  and  test  arrays)  simply  disrupted  memory  of  the  study  array  by  distraction,  perceptual  processing,  or 
interference  with  a  long-term  code.  To  demonstrate  that  the  disruption  only  occurs  during  comprehension,  in 
Experiment 3 we replicated Experiments 1 and 2, but removed the requirement to comprehend the texts. Thus,
although  the  pictures  and  texts  were  presented,  the  subjects  were  instructed  that  they  would  never  be  tested  on  the 
texts.  Several  weeks  after  participating  in  the  experiment,  subjects  were  invited  back  to  the  laboratory  to  take  a 
recognition  test  for  the  pictures. 

The  primary  finding  from  Experiment  3  was  that  the  experimental  version  of  the  concurrent  task  was  not 
disrupted by presentation of the texts with pictures. That is, the disruption only occurs when there is a
requirement to comprehend the texts, as in Experiments 1 and 2. This null effect is unlikely to be the result of low
power.  In  fact,  the  experiment  had  a  power  of  .99  to  detect  the  smallest  disruption  found  in  the  initial 
experiments.  Also,  the  absence  of  disruption  was  not  due  to  subjects  blocking  the  pictures  (e.g.,  by  looking 
away).  Recognition  memory  for  the  pictures  was  quite  good. 

One final piece of data is required to be confident that these results reflect use of the sketchpad when
comprehending texts with pictures. We must demonstrate that the selective disruption does not occur when the
concurrent task is not tapping the sketchpad. To do this, for Experiment 4 we designed two versions of a
concurrent  task  that  should  tap  the  articulatory  loop  component  of  working  memory,  but  not  the  sketchpad.  In  the 
experimental  version  of  the  task,  subjects  studied  a  sub-span  sequence  of  digits.  At  the  test,  the  subjects  were 
presented with two digits from the sequence and responded as to whether the digits were in the same order as in
the studied sequence. The control version of the task did not require memory. Instead, at the test, subjects
responded as to whether the two digits formed an odd or even number. As in the previous experiments, the
segments of each text were embedded between a study and test sequence, and half of the texts were accompanied
by  pictures. 
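
The two versions of the digit task can be sketched in the same way. This is a hypothetical illustration, not the experimental software. It assumes that the digits within a studied sequence are distinct and that the two probe digits are read together as a two-digit number for the odd/even judgment. The function names are ours.

```python
def order_response(studied, probe):
    """Experimental version: are the two probe digits in the same
    relative order as in the studied sequence? Assumes distinct digits."""
    first, second = probe
    return studied.index(first) < studied.index(second)


def parity_response(probe):
    """Control version: is the two-digit number formed by the probe even?"""
    first, second = probe
    return (10 * first + second) % 2 == 0


studied = [4, 9, 2, 7]                     # an arbitrary sub-span sequence
print(order_response(studied, (9, 7)))     # 9 precedes 7 in the sequence
print(order_response(studied, (7, 4)))     # 7 follows 4 in the sequence
print(parity_response((7, 4)))             # 74 is even
```

As with the dot task, the display and the response are matched across versions, and only order_response depends on remembering the studied sequence.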

Once  again,  we  expected  pictures  to  facilitate  comprehension  (and  they  did),  and  we  expected 
comprehension to suffer in the experimental version of the digit task (and it did). The critical question is whether
or not performance in the digit task is selectively disrupted by comprehending texts with pictures. We did not find
this selective disruption. Thus, when the concurrent task taps the sketchpad, performance on the task is disrupted
by comprehending texts with pictures (Experiments 1 and 2). When the concurrent task does not tap the
sketchpad,  performance  on  the  task  is  not  disrupted  by  comprehending  texts  with  pictures  (Experiment  4).  Thus, 
the  data  point  consistently  to  this  conclusion:  Comprehension  of  texts  with  pictures  requires  the  visual/spatial 
sketchpad  of  working  memory. 

How  is  the  sketchpad  being  used  during  comprehension?  We  believe  that  subjects  are  using  it  to  build 
spatial mental models. Experiments reported in Langston and Glenberg (in preparation) support the claim that the
models are spatial not only in what they represent, but also in the nature of the representation. Furthermore, these
experiments  demonstrate  a  functional  characteristic  of  spatial  mental  models  that  we  call  noticing.  Noticing  is  a 
process  by  which  a  reader  can  infer  (from  the  spatial  model)  a  relation  that  is  not  presented  in  the  text.  We  believe 
that  there  are  several  preconditions  for  noticing.  First,  comprehenders  must  be  building  a  spatial  mental  model. 
That  is,  they  must  be  using  a  spatial  dimension  of  the  sketchpad  to  represent  a  text-relevant  dimension  (e.g., 
temporal  order  of  steps  in  a  procedure).  Second,  attention  must  be  focused  on  at  least  one  entity  (representation  of 
an  object)  in  the  model.  Third,  when  the  focused  entity  is  juxtaposed  with  another  entity,  the  comprehender 
notices  the  spatial  relation  between  the  entities  and  encodes  this  as  an  inference  along  the  text-relevant  dimension. 

Noticing  has  several  characteristics  that  recommend  it  as  a  procedure  for  generating  inferences.  First,  the 
inferential  process  is  constrained  in  several  ways,  thus  preventing  an  inferential  explosion.  One  constraint  is  due 
to  the  limited  capacity  of  the  sketchpad.  Another  constraint  is  that  noticing  only  occurs  amongst  attended  entities. 
Second, noticing results in a more complete or elaborated understanding of the text than could be derived
from  the  text  alone. 

In  the  Langston  and  Glenberg  (in  preparation)  experiments,  subjects  read  texts  describing  a  spatial  layout. 

An example is given in Table 1. The layout is illustrated in Figure 1. Note that the last sentence of the text can be
presented  in  either  of  two  forms.  In  one  form,  the  last  object  (eggs)  is  described  in  a  location  that,  if  an  accurate 
spatial  representation  is  being  created,  will  be  adjacent  to  the  location  of  the  first  object  (flour).  We  call  this  the 
notice  position,  because  subjects  should  be  able  to  notice  the  relation  between  the  first  and  last  objects. 
Alternatively,  the  last  sentence  of  the  text  can  describe  the  location  of  the  last  object  that  happens  to  be  distant  from 
the  first  object.  We  call  this  the  not  notice  position.  After  the  last  sentence  is  presented,  we  probe  for  availability 
of  the  first  object  (flour)  using  a  speeded  recognition  test.  If  subjects  are  creating  accurate  spatial  models  and 
noticing  when  the  last  object  is  in  the  notice  position,  then  availability  of  the  target  (the  first  object)  should  be 
greater  (and  responding  faster)  than  when  the  last  object  is  in  the  not  notice  position. 
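
The difference between the notice and not notice positions can be illustrated by building the layout from the sentences of the example text in Table 1. The sketch below is not the simulation described in this report. It assumes that each object occupies one cell of a grid, that "in front of" means one row closer to the viewer, and that two objects are adjacent when their cells share an edge. The names OFFSETS, build_layout, and adjacent are hypothetical.

```python
# Offsets as (dx, dy); "in front of" is assumed to mean one row nearer the viewer.
OFFSETS = {
    "right of": (1, 0),
    "left of": (-1, 0),
    "in front of": (0, 1),
}


def build_layout(first_object, placements):
    """Place each object relative to an already placed landmark."""
    layout = {first_object: (0, 0)}
    for obj, relation, landmark in placements:
        dx, dy = OFFSETS[relation]
        x, y = layout[landmark]
        layout[obj] = (x + dx, y + dy)
    return layout


def adjacent(layout, a, b):
    """True when the two objects occupy cells that share an edge."""
    (ax, ay), (bx, by) = layout[a], layout[b]
    return abs(ax - bx) + abs(ay - by) == 1


# Sentences shared by both versions of the example text.
COMMON = [
    ("sugar", "right of", "flour"),
    ("cocoa", "right of", "sugar"),
    ("baking soda", "in front of", "cocoa"),
    ("milk", "left of", "baking soda"),
]

notice = build_layout("flour", COMMON + [("eggs", "left of", "milk")])
not_notice = build_layout("flour", COMMON + [("eggs", "in front of", "milk")])

print(adjacent(notice, "eggs", "flour"))       # True: eggs land next to the flour
print(adjacent(not_notice, "eggs", "flour"))   # False: eggs are distant from the flour
```

In the sketch, the adjacency of eggs and flour is never stated by any sentence. It falls out of placing the objects in a spatial medium, which is the sense in which the relation can be noticed rather than read.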

We  manipulated  two  additional  variables  to  help  pinpoint  the  nature  of  the  putative  noticing  process.  First, 
we  manipulated  the  number  of  objects  described  before  testing  the  availability  of  the  target.  Table  1  and  Figure  1 
illustrate  the  six-item  condition.  We  also  used  a  four-item  condition.  Our  account  of  noticing  is  that  it  occurs  in 
the  limited  capacity  sketchpad  of  working  memory.  Given  the  capacity  constraints,  noticing  may  not  occur  in  the 
six  item  condition  because  the  target  may  have  been  lost  from  the  model  before  the  sixth  item  is  presented. 

Second,  we  manipulated  whether  or  not  the  text  was  accompanied  by  pictures.  When  pictures  were  presented, 
only  the  first  three  items  accompanied  the  texts.  Thus  any  noticing  required  construction  of  a  spatial  model,  not 
just  examination  of  the  picture.  If  pictures  help  readers  to  build  and  maintain  spatial  mental  models,  we  might 
expect  noticing  to  occur  for  both  the  four  and  six  item  texts  when  they  are  accompanied  by  pictures. 

The  data  are  in  Figure  2.  The  noticing  effect  is  revealed  by  the  slope  from  left  to  right,  that  is,  faster 
responding  when  the  last  item  is  in  the  notice  position  than  when  it  is  in  the  not  notice  position.  When  pictures 
accompany  the  texts  (upper  panel),  the  noticing  effect  is  found  for  both  the  six-item  texts  and  the  four-item  texts. 
Now  turn  to  the  lower  panel,  which  illustrates  the  results  when  pictures  did  not  accompany  the  texts.  Now,  a 
noticing  effect  is  found  for  the  four-item  texts,  but  not  for  the  six-item  texts.  Apparently,  the  six-item  texts  (when 
presented  without  a  picture)  overload  the  capacity  of  the  sketchpad. 

These  data  are  important  for  two  reasons.  First,  they  illustrate  the  noticing  processes.  Second,  the  data 
point  to  the  nature  of  the  representation  of  mental  models.  Remember  that  a  mental  model  is  a  representation  of 
what  the  text  is  about,  not  a  representation  of  the  text.  A  priori,  a  mental  model  may  have  many  formats,  e.g., 
propositional,  spatial,  distributed.  The  noticing  data,  however,  point  to  a  truly  spatial,  analogical  representation. 
Given  a  propositional  representation,  for  example,  there  is  no  reason  to  suspect  any  difference  in  availability  of  the 
Table 1

Example  of  a  text  used  in  Langston  and  Glenberg  (in  preparation) 

Mary  was  arranging  ingredients  on  the  counter  to  bake  a  cake. 

Mary  put  the  flour  down  first. 

Then  she  put  the  sugar  to  the  right  of  the  flour. 

Then  she  put  the  cocoa  to  the  right  of  the  sugar. 

Next  Mary  put  the  baking  soda  in  front  of  the  cocoa. 

Then  she  put  the  milk  to  the  left  of  the  baking  soda. 

Finally she put the eggs (to the left of) (in front of) the milk.

Note:  The  target  object  is  flour,  and  it  is  probed  after  the  last  sentence.  The  eggs  are  in  the  notice  position 
when  described  as  to  the  left  of  the  milk.  The  eggs  are  in  the  not  notice  position  when  described  in  front  of  the 
milk. 


flour (Target)            sugar                         cocoa

eggs (Notice Position)    milk                          baking soda

                          eggs (Not Notice Position)


Figure 1: Spatial layout corresponding to the text in Table 1. The boxes in bold correspond to items
illustrated (in the six-item condition) when a picture accompanied the text.
Figure 2

Data from Langston & Glenberg (in preparation). Horizontal axis: Condition.

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target  as  a  function  of  the  described  position  of  the  last  item.  Certainly  a  comprehender  could  apply  a  series  of 
tests  to  the  propositions  to  derive  the  fact  that  in  the  notice  condition  the  last  item  is  near  the  target.  However,  it  is 
difficult  to  understand  why  a  comprehender  would  ever  bother  to  do  so.  If  the  representation  of  the  mental  model 
is created in a spatial analog medium, however, then the noticing effect is predicted. That is, given a spatial
representation system and proper interpretation of the sentences, the last item must be adjacent to the first item in
the notice condition. In other words, intrinsic constraints on the spatial medium force the last item and the target
item to be proximal.
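
The adjacency argument can be made concrete with the text in Table 1. The following is a minimal sketch, not the authors' simulation: the grid coordinates, the offset convention, and the helper names build_layout and adjacent are assumptions introduced only for illustration.

```python
# Sketch: place the items of Table 1 on a 2-D grid and test adjacency to the target.
# Assumed convention: x grows to the right, y grows toward the viewer ("in front of").
OFFSETS = {
    "right of": (1, 0),
    "left of": (-1, 0),
    "in front of": (0, 1),
}

def build_layout(first_item, statements):
    """Return {item: (x, y)} from (item, relation, reference) statements."""
    layout = {first_item: (0, 0)}
    for item, relation, reference in statements:
        dx, dy = OFFSETS[relation]
        rx, ry = layout[reference]
        layout[item] = (rx + dx, ry + dy)
    return layout

def adjacent(layout, a, b):
    """True when a and b occupy neighbouring grid cells (diagonals excluded)."""
    (ax, ay), (bx, by) = layout[a], layout[b]
    return abs(ax - bx) + abs(ay - by) == 1

# The first five sentences are shared by both versions of the text.
COMMON = [
    ("sugar", "right of", "flour"),
    ("cocoa", "right of", "sugar"),
    ("baking soda", "in front of", "cocoa"),
    ("milk", "left of", "baking soda"),
]

notice = build_layout("flour", COMMON + [("eggs", "left of", "milk")])
not_notice = build_layout("flour", COMMON + [("eggs", "in front of", "milk")])

print(adjacent(notice, "eggs", "flour"))      # True
print(adjacent(not_notice, "eggs", "flour"))  # False
```

In the spatial layout, the proximity of the eggs to the flour falls out of the coordinates themselves. No separate chain of tests over the propositions is needed.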

The  experimental  research  has  led  us  to  a  series  of  conclusions  about  the  construction  and  use  of  mental 
models and pictures during comprehension. The various conclusions, however, derive from separate experiments
using  a  variety  of  texts,  pictures,  and  methodologies.  To  what  extent  can  we  be  sure  that  all  of  the  processes  and 
mechanisms  can  work  together  in  a  single  coherent  system?  To  answer  this  question,  we  set  about  to  build  a 
computer  simulation  of  the  construction  of  mental  models.  The  simulation  is  far  from  being  finished,  but  a 
preliminary  report  is  given  in  Glenberg,  Kruley,  and  Langston  (in  press).  What  follows  is  abstracted  from  that 
report. 

In  outline,  the  simulation  constructs  propositions  from  (highly-coded)  words  in  sentences.  These 
propositions  are  used  to  construct  mental  models  and  to  direct  the  manipulation  of  the  models.  If  a  picture  is 
available,  it  is  used  to  guide  the  construction  of  the  model.  Once  constructed,  the  model  becomes  a  source  of  new 
information  about  the  situation. 

Nodes,  propositions,  and  the  mental  model 

One  component  of  the  simulation's  memory  is  the  node.  We  use  nodes  to  represent  specific  objects  (or 
more  generally,  entities),  as  opposed  to  classes,  and  each  object  described  by  a  text  has  a  corresponding  node. 

The  node  encodes  a  limited  amount  of  information  about  the  object  including  its  count  (singular  or  plural), 
animacy  (animate  or  inanimate),  gender  (male,  female,  or  neuter),  and  semantic  class.  Because  the  simulation 
does not have a permanent knowledge base, this information must be hand-coded into the "text" that the
simulation processes.

Five types of propositions are derived from the text. Word propositions encode verbatim the actual word(s)
used  to  name  an  object.  The  proposition  consists  of  the  words  used,  and  a  pointer  to  the  node  named  by  the 
words.  Language  propositions  encode  some  linguistic  and  semantic  attributes  such  as  whether  the  word  is  the 
grammatical  subject  and  the  given/new  status,  and  they  also  include  a  pointer  to  the  node.  Although  these 
propositions  are  necessary  to  the  operation  of  the  model,  we  will  have  little  to  say  about  them  here.  Description 
propositions  encode  unary  attributes  of  objects  such  as  size  or  color,  and  they  also  include  a  pointer  to  the  node 
being  described.  Existence  propositions  encode  the  fact  that  an  object  exists.  Finally,  relational  propositions 
encode  relations  among  objects.  The  proposition  includes  a  specification  of  the  relation  (e.g.,  "attached")  and 
pointers  to  the  nodes  taking  part  in  the  relation.  One  of  the  pointers  is  designated  as  the  "focus"  of  the 
proposition.  The  focus  node  is  typically  the  grammatical  subject  (determined  from  the  language  proposition). 
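
One way to picture nodes and propositions is as small records. The sketch below is an assumption-laden illustration, not the data structures of the actual simulation: the class names, field names, and the use of a single Proposition class with a kind field are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    """A specific object (entity) described by the text."""
    name: str
    count: str           # "singular" or "plural"
    animacy: str         # "animate" or "inanimate"
    gender: str          # "male", "female", or "neuter"
    semantic_class: int  # arbitrary hand-coded category code

@dataclass
class Proposition:
    """A proposition of one of the five types named in the text."""
    kind: str                     # "word", "language", "description", "existence", or "relation"
    nodes: List[Node]             # pointers to the nodes involved
    content: str = ""             # the words, the attribute, or the relation name
    focus: Optional[Node] = None  # used only by relation propositions
    activation: float = 0.0       # graded availability

john = Node("John", "singular", "animate", "male", semantic_class=1)
sws = Node("SwS", "singular", "inanimate", "neuter", semantic_class=2)

word_prop = Proposition("word", [john], content="John")
rel_prop = Proposition("relation", [john, sws], content="attached to", focus=john)

print(rel_prop.focus.name)  # John
```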

Each of these types of propositions can be activated by various sources (described shortly). Because
activation  of  propositions  is  continuous  and  graded,  there  is  little  distinction  between  information  "in"  working 
memory  and  information  "in"  long-term  store.  Propositions  that  are  highly  activated  are  easy  to  retrieve,  whereas 
propositions  that  have  little  activation  are  difficult  to  retrieve.  Propositions  are  never  deleted  from  memory, 
however.  Nodes  are  not  directly  activated.  Instead,  the  activation  (availability)  of  a  node  is  given  by  the 
activation  of  all  of  the  propositions  that  point  to  that  node.  Thus,  nodes,  like  propositions,  are  available  to  a 
graded  degree. 
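
A minimal sketch of this indirect activation follows. It assumes that a node's availability is the simple sum of the activations of the propositions pointing to it; the text says only that availability is "given by" those activations, so the summation rule, the function name, and the numeric values are illustrative assumptions.

```python
def node_activation(node_name, propositions):
    """Sum the activation of every proposition that points to node_name.

    propositions: iterable of (activation, set_of_node_names) pairs.
    """
    return sum(act for act, pointed in propositions if node_name in pointed)

# Hypothetical activations chosen only for illustration.
propositions = [
    (0.5, {"John"}),          # a word proposition
    (0.25, {"John", "SwS"}),  # a relation proposition
    (0.125, {"SwS"}),         # a description proposition
]

print(node_activation("John", propositions))  # 0.75
print(node_activation("SwS", propositions))   # 0.375
```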

We  conceptualize  the  mental  model  as  being  constructed  in  a  three-dimensional  spatial  medium 
corresponding  to  the  visual/spatial  sketchpad  of  working  memory  (Baddeley,  1990).  The  mental  model  is 
extremely  limited  in  capacity  because  of  limitations  on  activation.  Entities  in  the  model  are  pointers  to  nodes. 
Distance  between  pointers  is  representationally  meaningful.  That  is,  pointers  that  are  closer  together  are  more 
strongly  related.  The  spatial  dimensions  ordinarily  correspond  to  up/down,  front/back,  and  left/right.  However, 
when the simulation is assumed to have the requisite knowledge, the spatial dimensions may be used to represent
other, text-relevant dimensions such as time, energy, mass, friendliness, etc. The pointers (entities in the model)
can  be  activated  to  various  degrees,  and  when  a  pointer’s  activation  falls  below  a  threshold,  it  is  removed  from  the 
model.  Thus,  unlike  propositions  and  nodes,  pointers  are  temporary. 
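
The temporary nature of pointers can be sketched as follows. This is an illustration under stated assumptions, not the simulation's code: the class name MentalModel, the threshold value, and the uniform scaling used to lower activation are all hypothetical.

```python
import math

class MentalModel:
    """Pointers to nodes placed in a three-dimensional spatial medium."""

    def __init__(self, threshold=0.1):
        self.threshold = threshold  # assumed removal threshold
        self.pointers = {}          # node name -> {"pos": (x, y, z), "activation": float}

    def place(self, name, pos, activation):
        self.pointers[name] = {"pos": pos, "activation": activation}

    def distance(self, a, b):
        """Euclidean distance; smaller values mean more strongly related."""
        return math.dist(self.pointers[a]["pos"], self.pointers[b]["pos"])

    def lose_activation(self, factor):
        """Scale every activation by factor, then drop pointers below threshold."""
        for pointer in self.pointers.values():
            pointer["activation"] *= factor
        self.pointers = {
            name: pointer
            for name, pointer in self.pointers.items()
            if pointer["activation"] >= self.threshold
        }

model = MentalModel()
model.place("John", (0.0, 0.0, 0.0), activation=1.0)
model.place("SwS", (1.0, 0.0, 0.0), activation=0.15)
print(model.distance("John", "SwS"))  # 1.0

model.lose_activation(0.5)
print(sorted(model.pointers))  # ['John']
```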


Processing  in  the  simulation 

Processing is controlled by a working memory with multiple (but relatively fixed) capacities used for
different tasks. The articulatory capacity is used to activate word and language propositions. The spatial capacity
is used to read a word, examine a picture, and maintain pointers in the mental model. The general capacity can be
deployed to support any of the activities already mentioned, and it is used to activate relation and description
propositions. In addition, general capacity is used to support cognitive activities such as retrieving information
and manipulating the pointers in the mental model.

Each  activity  (e.g.,  representing  a  pointer  in  the  mental  model,  retrieving  a  proposition)  requires  a  particular 
amount  and  type  of  capacity.  However,  the  capacities  are  strictly  limited  and  are  quickly  allocated.  When 
available  capacity  is  insufficient  for  an  activity,  capacity  is  recovered  from  memory  using  a  proportionality 
algorithm.  Each  element  in  memory  (mental  model  pointer  or  proposition)  gives  up  a  part  of  the  capacity  assigned 
to  it  proportional  to  the  total  amount  of  that  capacity  being  used.  This  algorithm  produces  negatively  accelerated 
forgetting  (decrease  in  retrievability)  and  an  extremely  interactive  system. 
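
The proportionality algorithm can be sketched in a few lines. The reading adopted here is an assumption: each element gives up the same fraction of its own holding, with that fraction set by the amount needed relative to the total in use. The function name and the numeric values are hypothetical.

```python
def recover_capacity(holdings, needed):
    """Free `needed` units of one capacity type.

    holdings: dict mapping each memory element to the capacity it holds.
    Returns (remaining_holdings, amount_freed).
    """
    total = sum(holdings.values())
    if total == 0:
        return dict(holdings), 0.0
    fraction = min(1.0, needed / total)  # same fraction taken from every element
    remaining = {name: held * (1.0 - fraction) for name, held in holdings.items()}
    return remaining, total * fraction

holdings = {"P1": 4.0, "P2": 2.0, "SwS pointer": 2.0}
remaining, freed = recover_capacity(holdings, needed=2.0)
print(remaining)  # {'P1': 3.0, 'P2': 1.5, 'SwS pointer': 1.5}
print(freed)      # 2.0
```

Under this reading, an element that is never refreshed loses a fixed fraction of what it still holds on each demand, so its absolute loss shrinks over time. That is one way to obtain the negatively accelerated forgetting described above.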

Retrieval  is  based  on  a  resonance  metaphor,  much  like  the  Minerva  II  model  (Hintzman,  1986).  The  same 
retrieval  process  is  used  during  comprehension  (e.g.,  in  retrieving  antecedents  for  anaphors)  and  in  memory 
tasks.  In  outline,  retrieval  works  by  assembling  one  or  more  propositions  to  use  as  retrieval  cues.  Next, 
activation  is  recovered  proportional  to  the  number  of  propositions  used  as  cues.  The  cues  are  compared  to  all 
propositions  in  memory,  and  the  activation  of  those  propositions  is  increased  in  direct  proportion  to  their  current 
activation  (recency),  in  direct  proportion  to  their  similarity  to  the  cue  (encoding  specificity),  and  in  inverse 
proportion  to  the  number  of  propositions  contacted  by  the  cue  (cue  overload,  or  fan  effects).  The  result  of  a 
retrieval  operation  is  a  redistribution  of  (the  previously  recovered)  activation  across  the  propositions  and  a 
consequent  change  in  the  availability  of  the  nodes  pointed  to  by  the  propositions. 
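
The three proportionalities in the retrieval rule can be written out directly. This sketch is a simplification under stated assumptions: similarity is taken as the share of cue features a proposition matches, and a free `gain` parameter stands in for the pool of previously recovered activation. Neither choice comes from the report, and the feature sets are hypothetical.

```python
def retrieve(memory, cue, gain=1.0):
    """Boost propositions that resonate with the cue; return the boosts.

    memory: dict name -> {"features": set, "activation": float}
    cue: non-empty set of features.
    """
    similarity = {
        name: len(cue & prop["features"]) / len(cue)
        for name, prop in memory.items()
    }
    contacted = [name for name, sim in similarity.items() if sim > 0]
    boosts = {}
    for name in contacted:
        # current activation x similarity / number of propositions contacted
        boosts[name] = gain * memory[name]["activation"] * similarity[name] / len(contacted)
        memory[name]["activation"] += boosts[name]
    return boosts

memory = {
    "P1": {"features": {"male", "singular", "animate"}, "activation": 0.5},
    "P3": {"features": {"neuter", "singular"}, "activation": 0.5},
    "P4": {"features": {"white"}, "activation": 0.5},
}
print(retrieve(memory, cue={"male", "singular"}))  # {'P1': 0.25, 'P3': 0.125}
```

Here P1 matches the whole cue and receives the larger boost, P3 matches half of it, and P4 is not contacted at all. Adding more matching propositions would shrink every boost, which is the cue-overload pattern.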

Whenever  a  relation  or  existence  proposition  is  constructed,  the  simulation  treats  the  proposition  as  a 
direction  to  update  the  mental  model.  This  updating  involves  several  major  steps.  First,  activation  is  recovered  to 
drive  the  following  steps.  Then,  appropriate  pointers  are  inserted  into  the  model,  if  they  are  not  there  already. 
Next,  if  the  proposition  describes  a  relation  between  pointers  that  is  not  extant  in  the  model,  the  pointers  are 
moved  into  that  relation.  If  a  picture  of  the  situation  is  available,  that  picture  is  used  to  help  construct  the  model. 
For  example,  the  text  might  describe  Object  A  as  near  to  Object  B.  Given  a  picture,  the  mental  model  would  be 
able  to  represent  whether  Object  A  is  to  the  left  or  right  of  Object  B. 

Finally,  the  simulation  learns  from  the  mental  model  using  a  process  we  call  noticing.  After  the  mental 
model  is  manipulated,  the  simulation  searches  for  all  pointers  within  the  "noticing  radius"  of  the  pointer  that  was 
manipulated.  If  such  a  pointer  is  found,  the  simulation  notices,  that  is,  generates  a  proposition  describing  the 
relation  between  the  manipulated  pointer  and  the  found  pointer.  These  noticed  propositions  are  supported  by 
general  capacity  and  stored  in  memory  with  pointers  to  the  relevant  object  nodes.  Prototypically,  the  noticed 
relation is spatial (e.g., "left of"), but the interpretation of the relation depends on the domain-relevant dimension
assigned  to  the  spatial  dimension.  By  virtue  of  noticing,  the  simulation  infers  information  that  is  not  explicit  in 
the  text  and  learns  from  manipulation  of  its  own  mental  model. 
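
A minimal sketch of noticing follows, assuming a two-dimensional model, a Euclidean noticing radius, and a simple rule for naming the relation along the dominant axis. The function name, the radius value, and the coordinate convention (x grows rightward, y grows toward the viewer) are illustrative assumptions.

```python
import math

def notice(model, moved, radius):
    """Return (moved, relation, other) for every pointer within radius of moved.

    model: dict name -> (x, y).
    """
    mx, my = model[moved]
    noticed = []
    for name, (x, y) in model.items():
        if name == moved:
            continue
        if math.hypot(x - mx, y - my) <= radius:
            if abs(x - mx) >= abs(y - my):
                relation = "left of" if mx < x else "right of"
            else:
                relation = "behind" if my < y else "in front of"
            noticed.append((moved, relation, name))
    return noticed

# Part of the Table 1 layout in the notice condition.
model = {"flour": (0, 0), "sugar": (1, 0), "milk": (1, 1), "eggs": (0, 1)}
print(notice(model, "eggs", radius=1.0))
# [('eggs', 'in front of', 'flour'), ('eggs', 'left of', 'milk')]
```

Only the relation between the eggs and the milk was stated in the text. The relation between the eggs and the flour is new information read off the model.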

Reading  a  simple,  one  proposition  SVO  sentence  proceeds  as  follows.  (Each  step  requires  that  a  sufficient 
amount  of  capacity  be  available  or  be  recovered.  Discussion  of  this  is  suppressed  for  clarity.)  First,  the  subject 
noun is read and represented verbatim using articulatory capacity. If the word is marked as "new," a new node is
generated  to  represent  the  specific  object.  If  the  word  is  marked  as  "given,"  a  search  of  memory  is  conducted  for 
a  possible  referent;  if  none  is  found,  a  new  node  is  generated.  The  search  (using  general  capacity)  uses  the 
retrieval  algorithm,  and  the  cues  consist  of  information  available  about  the  word  (e.g.,  gender).  When  the  verb  is 
read,  it  is  encoded  as  a  relation  of  a  proposition,  and  memory  is  searched  (using  the  retrieval  algorithm)  for  an 
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appropriate  initial  argument  (e.g.,  one  that  agrees  in  number  with  the  verb).  When  the  object  noun  is  read,  it  is 
represented verbatim, and a node is retrieved or created for it. Then, the developing relation proposition is
retrieved  and  completed. 
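
The given/new decision for a noun can be sketched as below. This is a deliberately simplified illustration: it replaces the graded, capacity-limited retrieval algorithm with an exact feature match, and the dictionary layout and the name resolve are assumptions.

```python
FEATURES = ("count", "animacy", "gender", "semantic_class")

def resolve(word, memory):
    """Return (node, created) for a hand-coded word.

    word: dict with "text", "status" ("given" or "new"), and the FEATURES keys.
    memory: list of node dicts; a newly created node is appended to it.
    """
    if word["status"] == "given":
        # Search memory for a possible referent (exact match stands in for retrieval).
        for node in memory:
            if all(node[f] == word[f] for f in FEATURES):
                return node, False
    # Marked "new", or no referent found: generate a new node.
    node = {f: word[f] for f in FEATURES}
    node["name"] = word["text"]
    memory.append(node)
    return node, True

memory = []
john = {"text": "John", "status": "given", "count": "singular",
        "animacy": "animate", "gender": "male", "semantic_class": 1}

print(resolve(john, memory)[1])  # True  (first mention: no referent, node created)
print(resolve(john, memory)[1])  # False (second mention: existing node found)
```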

When  the  completed  proposition  specifies  a  relation  represented  by  one  of  the  dimensions  in  the  mental 
model,  the  proposition  is  treated  as  an  instruction  to  update  the  model  (using  spatial  capacity).  First,  however,  if  a 
picture  is  available,  the  picture  is  searched  for  the  arguments  of  the  proposition.  If  the  arguments  are  found  in  the 
picture,  then  the  spatial  layout  of  the  picture  controls  where  the  pointers  to  the  objects  are  placed  in  the  mental 
model. 

Once  the  mental  model  has  been  updated,  noticing  occurs  within  the  noticing  radius  of  the  pointer 
corresponding  to  the  focus  of  the  proposition.  Any  noticed  relations  are  encoded  propositionally,  and  stored  with 
the  appropriate  nodes. 

Because  the  simulation  respects  work  on  memory,  it  can  successfully  simulate  standard  findings  such  as 
recency  effects  and  long-term  recency  effects  (Glenberg,  Bradley,  Kraus,  &  Renzaglia,  1983)  due  to  changes  in 
capacities  devoted  to  various  propositions;  proactive  interference  and  release  from  proactive  interference  due  to  cue 
overload  (Watkins  &  Watkins,  1975),  as  well  as  some  rather  unusual  new  findings  such  as  the  revelation  effect 
(Watkins  &  Peynircioglu,  1990).  Also,  the  simulation  has  had  success  in  simulating  work  on  mental  models 
(Glenberg,  et  al.,  1987),  effects  of  pictures  on  comprehension  (Glenberg  &  Langston,  in  press),  map-learning 
(McNamara,  Halpin,  &  Hardy,  in  press),  as  well  as  retrieval  of  antecedents  for  anaphors  (O'Brien,  Plewes,  & 
Albrecht,  1990). 

Simulation  and  data 

To  provide  a  sense  of  how  the  simulation  works,  we  will  describe  in  more  detail  how  it  deals  with  two 
phenomena,  effects  of  mental  models  on  foregrounding  (Glenberg,  et  al.,  1987),  and  learning  of  cognitive  maps 
(McNamara,  et  al.,  in  press).  The  simulation  of  Glenberg  et  al.  (1987)  illustrates  how  mental  models  are 
constructed  from  texts  and  how  the  model  can  influence  comprehension  processes.  The  simulation  of  the 
McNamara  et  al.  results  illustrates  how  pictorial  information  can  be  used  to  help  construct  mental  models,  and  it 
illustrates  the  operation  of  the  "noticing"  process. 

The  point  of  the  Glenberg  et  al.  (1987)  experiment  was  to  demonstrate  that  the  structure  of  the  situation  (as 
opposed  to  the  structure  of  the  text)  plays  an  important  role  in  foregrounding.  When  we  say  that  a  concept  is 
foregrounded,  we  mean  that  the  concept  is  readily  available  and  thus  easy  to  refer  to,  especially  by  a  pronoun.  To 
demonstrate  that  the  structure  of  the  situation  influences  foregrounding,  Glenberg  et  al.  (1987)  showed  that  a 
critical  object  (e.g.,  sweatshirt)  that  is  spatially  associated  with  a  main  actor  tends  to  remain  foregrounded  longer 
than  a  critical  object  that  is  spatially  dissociated  with  the  main  actor.  Figure  3  shows  a  comparison  of  the 
Glenberg et al. (1987) results and the results from the simulation. The dependent variable for the simulation is a
transformation  of  the  activation  of  the  critical  object  node  so  that  it  can  be  more  easily  compared  to  reaction  time. 

The  text  used  in  the  simulation  is  given  in  Table  2.  It  is  a  simplified  version  of  the  text  used  in  Glenberg  et 
al. (1987). Figure 4 portrays the situation in the simulation’s memory immediately after reading the associated
sentence.  We  will  first  describe  how  the  simulation  got  into  the  state  illustrated  in  Figure  4,  and  then  we  will 
describe  differences  between  the  associated  and  dissociated  conditions  from  that  point  on. 

Upon  starting  a  new  sentence,  the  simulation  captures  enough  activation  to  process  a  typical,  simple 
sentence,  namely  the  activation  needed  to  encode  a  subject,  an  object,  and  a  relation.  The  simulation  then  reads  the 
word  "John"  hand-coded  along  a  number  of  dimensions:  number  (singular),  animacy  (animate),  gender  (male), 
given-new (given: unless concepts are specifically marked as new [e.g., by use of an indefinite article], they are
treated as given), and grammatical class (subject). In addition, a semantic code is assigned to John that represents
categorical  information.  These  arbitrary  semantic  codes  provide  a  way  of  assigning  entities  either  to  the  same  or  to 
different  categories.  Because  of  the  semantic  code,  the  simulation  does  not  confuse  "John"  with  "Fido,"  which 
would  otherwise  be  coded  identically. 


Final Technical Report
7/1/89 through 11/30/92
Page 11
AFOSR-89-0367


Figure 3. The data on the left are from Glenberg et al. (1987). On the right are data from the
simulation of that experiment as described in the text.

Left panel: Data from Glenberg et al. (1987). Right panel: Simulation of Glenberg et al. (1987).
Vertical axis: Response Time. Horizontal axis (both panels): Delay (Sentences).




Table  2 

Text  used  in  the  simulation  of  Glenberg,  Meyer  and  Lindem  (1987) 
corresponding  to  the  associated  and  filler  sentences  in  Table  1 

Sentence type            Sentence
Critical (associated)    John put on a white sweatshirt.
Filler                   John ran to the lake.
Filler                   John has muscles.


Figure 4. Relevant aspects of the simulation after it has processed the associated sentence in Table 2. The top
portion illustrates two dimensions of the three-dimensional mental model. Symbols in angle brackets
are pointers to object nodes. The labels P1, P2, etc. refer to propositions derived from the text. The
widths of the symbols within the square brackets correspond to the amount of spatial (ellipses), general
(open rectangles) and articulatory (filled rectangles) capacity devoted to each proposition and to each
pointer in the mental model.


Mental Model:    • John  • SwS                    • Lake

P1    Word: <John>, "John"
P2    Rel: <John>, attached to, <SwS>
P3    Word: <SwS>, "White sweatshirt"
P4    Description: <SwS>, White
P5    Word: <John>, "John"
P6    Rel: <John>, runs to, <Lake>
P7    Word: <Lake>, "Lake"

The simulation uses the coded information about "John" to search memory for any nodes to which the word
"John" may refer. Because this is the first word of the text, none is found, and so a new node is created. The
node includes the information that John is singular, animate, and male, along with the semantic code. A proposition (P1 in
Figure 4) is formed that indicates that the word "John" was used to refer to this node, and the first argument in the
proposition is a pointer to the node John (<John>). This proposition is supported with both articulatory and
general activation from the amount reserved at the beginning of the sentence.

The simulation then reads "puts on." This is coded as the relation "attached to," along with the information
that a singular active subject is required for this relation. A relation proposition is created (P2), but at this time
only the relation "attached to" is specified. The arguments of the proposition must be either retrieved or read. The
simulation uses the conditions "singular" and "active" to attempt to retrieve nodes that match the conditions
specified by the coding of "puts on." The node John will be found (if there has not been interfering activity such
as a long phrase in between the reading of "John" and "puts on"). The retrieval process boosts the activation of
the word proposition (P1) and thereby the node corresponding to John. A pointer to the node becomes the focus
of proposition P2, and this proposition is supported by general activation from that reserved at the beginning of
the sentence. General activation is used, rather than articulatory activation, because the proposition does not
correspond directly to anything that can be articulated in an articulatory loop. Note that the proposition is not yet
completed because what John is attached to has not yet been read. The incomplete proposition is given an extra
boost of activation so that it is not inadvertently lost before the proposition can be completed.

The  simulation  then  reads  "a  white  sweatshirt."  Sweatshirt  is  coded  as  singular,  new,  inanimate,  neuter, 
grammatical object, and given a semantic code. Because sweatshirt is coded as "new" (based on the indefinite
"a") no search is conducted for a matching node. Instead, the Sws node is created. The fact that the words "white
sweatshirt" were used to refer to the node is encoded by a word proposition, P3. In addition, descriptive
information about this node, in particular that the object is white, is encoded by a description proposition (P4).
Both of these propositions are supported by the activation reserved at the beginning of the sentence.

Because sweatshirt was coded as a grammatical object, the simulation searches for an incomplete relation
proposition, and finds P2. A pointer to Sws is inserted into P2, and any extra activation used to keep P2 from
being forgotten (before it was completed) is reduced.

After  a  relation  proposition  is  completed,  the  simulation  determines  if  that  proposition  has  any  implications 
for  the  situation  (mental  model)  that  is  being  represented.  In  this  case,  pointers  to  the  John  and  Sws  nodes  are 
introduced  into  the  mental  model  in  close  proximity.  These  pointers  are  supported  by  a  combination  of  spatial  and 
general  activation. 

Suppose that the test probe "sweatshirt" is presented at this time. Responding to this probe will be quick
and accurate because the Sws node is highly activated. Note that this is the case in both the associated and the
dissociated conditions (see below) because information about the sweatshirt has just been encoded in both cases.

Before  the  next  sentence  is  attempted,  the  simulation  again  reserves  activation.  Because  total  activation  in 
the  system  is  limited,  this  reduces  or  "suppresses"  information  from  the  previous  sentence.  Upon  reading  the 
coded  version  of  John,  a  search  is  initiated  and  the  John  node  is  found.  A  new  proposition,  encoding  the  fact  that 
the specific word "John" was used again, is encoded (P5). This search process will have increased the activation
of P1 (because the retrieval cue matches the proposition) and decreased the activation of other propositions, such
as P3. The words "ran to" are encoded as the relation "moves to," and the incomplete proposition (P6) is
supported  by  extra  activation,  until  it  is  completed. 

Because  "the  lake"  is  coded  as  "given"  (based  on  the  use  of  the  definite  article)  a  search  for  a  compatible 
node is initiated. When none is found, a new node is created, and the proposition encoding the word "lake" is
formed (P7). Because lake is coded as a grammatical object, the simulation attempts to retrieve an incomplete
proposition  (P6).  When  P6  is  retrieved,  a  pointer  to  the  lake  node  is  added  to  it.  All  of  this  processing  has 
greatly  reduced  the  activation  of  propositions  pointing  to  the  Sws  node. 
At this point, some interesting processing occurs. With the completion of the proposition, the mental model
is updated. A pointer to the lake node is entered into the mental model, and the simulation attempts to manipulate
the mental model to be consistent with the recently encoded proposition, that John moves to the lake. In preparing
to move the pointer representing John, the simulation notices that a pointer to Sws is very near to John. Should
that pointer be moved too? The mental model does not represent (directly) the fact that John and Sws are
attached, only that they are spatially close. The information that John is attached to his sweatshirt is given only in
the propositions. Thus, the simulation attempts to retrieve information relating John and Sws by using as a
retrieval cue a proposition consisting of a pointer to John, a pointer to Sws, and the relation "attached to." If a
corresponding proposition can be retrieved, then both pointers in the model will be moved to the lake. In fact, the
simulation is successful in retrieving P2, and both pointers are moved. These processes increase activation of the
Sws node in several ways. First, activation of the Sws node is enhanced because retrieval of P2 increases the
activation of P2 (and hence the activation of the Sws node). Second, manipulating the pointer to Sws in the
mental model enhances the pointer's activation (and hence the activation of the Sws node). If the test probe
"sweatshirt" is presented at this time, responding will be relatively quick. That is, retrieval of the Sws node will
be facile because it is highly activated.

When  the  dissociated  condition  sentences  are  processed,  the  situation  is  exactly  the  same  as  in  Figure  4, 
except  that  the  relation  in  P2  is  coded  as  "next  to"  rather  than  "attached  to."  In  this  case,  when  the  mental  model 
is updated (after completing P6), the pointer to Sws is not moved, and it receives far less activation (because it is
not  manipulated  in  any  way).  Consequently,  the  Sws  node  is  not  highly  activated  and  responding  to  the  probe  is 
slower  than  in  the  associated  condition. 
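
The contrast between the two conditions can be sketched as a single update step. This is an illustration only: set membership stands in for the graded retrieval of P2, the coordinates and the `near` distance are hypothetical, and moved pointers are simply placed at the destination's location.

```python
def move_actor(model, propositions, actor, destination, near=1.0):
    """Move actor to destination, carrying along nearby attached objects.

    model: dict name -> (x, y); propositions: set of (focus, relation, other).
    Returns the names of the pointers that were moved.
    """
    ax, ay = model[actor]
    moved = [actor]
    for name, (x, y) in model.items():
        if name in (actor, destination):
            continue
        close = abs(x - ax) + abs(y - ay) <= near
        # A nearby pointer moves only if an "attached to" proposition is retrieved.
        if close and (actor, "attached to", name) in propositions:
            moved.append(name)
    for name in moved:
        model[name] = model[destination]
    return moved

layout = {"John": (0.0, 0.0), "SwS": (0.5, 0.0), "Lake": (5.0, 0.0)}

associated = dict(layout)
print(move_actor(associated, {("John", "attached to", "SwS")}, "John", "Lake"))
# ['John', 'SwS']

dissociated = dict(layout)
print(move_actor(dissociated, {("John", "next to", "SwS")}, "John", "Lake"))
# ['John']
```

In the dissociated run the pointer to SwS is left behind and never manipulated, which corresponds to the lower activation described above.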

Why  did  we  elect  not  to  represent  in  the  mental  model  the  fact  that  John  and  sweatshirt  are  attached  (in  the 
associated  condition)?  First,  we  envision  the  mental  model  as  extremely  limited  in  capacity  because  it  utilizes  the 
limited visual-spatial scratchpad of working memory. Second, John could enter into many different types of
relations  with  many  different  types  of  objects,  and  it  is  not  clear  how  to  determine  which  ones  should  be 
represented  in  the  mental  model.  In  the  current  version,  the  rule  is  simple:  only  include  in  the  mental  model  the 
relations  being  represented  by  the  spatial  dimensions;  all  other  relations  are  represented  propositionally  in 
memory.  Third,  our  procedure  has  a  natural  consequence.  If  an  object  is  not  integral  to  the  following  text,  the 
pointer  to  the  object's  node  will  soon  be  dropped  from  the  mental  model,  as  illustrated  next. 

The second sentence after the critical associated sentence is "John has muscles." On processing this
sentence, the simulation adds a node for muscles, and introduces a pointer to muscles into the mental model.
Propositions are formed encoding that John and his muscles are attached. Because sweatshirt is not referred to
again, its activation is extremely low, and responding to a probe will be slow (see Figure 3). Furthermore,
because processing the mental model does not manipulate the pointer to Sws, its activation drops, and it is lost from the
mental model (when its activation drops below a threshold, it is removed from the model). Thus, the model does
not become cluttered with objects that could have been relevant, but soon turn out to be of little interest. Figure 3
illustrates both the data from Glenberg et al. (1987) and the results from our simulation using the text in Table 2.

Our  second  demonstration  of  the  simulation  addresses  two  issues.  The  first  is  how  pictures  help 
comprehenders  to  construct  mental  models.  The  second  is  how  mental  models  can  produce  new  learning  based 
on "noticing" (Glenberg & Langston, 1992). We use results presented in McNamara et al. (in press) to
demonstrate  these  features  of  the  simulation. 

The  McNamara  et  al.  paper  examines  the  contribution  of  spatial  and  temporal  contiguity  to  the  development 
of spatial relations. The subjects were to learn the locations of objects on a map, much like that illustrated in
Figure  5.  Object  locations  are  represented  by  dots,  and  the  names  of  the  objects  (in  the  figure,  not  the  experiment) 
are  given  by  letters  of  the  alphabet.  In  the  experiment,  the  objects  occurred  in  two  regions,  as  indicated  by  the 
heavy  line  down  the  middle  of  the  figure.  After  learning,  subjects  received  several  types  of  tests.  In  the  region 
test,  subjects  had  to  quickly  decide  to  which  region  a  named  object  belonged.  In  the  recognition  test,  the  subject 
simply  decided  if  an  object  name  occurred.  Because  we  have  not  yet  implemented  regions  into  our  simulation  of 
mental  models,  we  will  focus  on  the  recognition  test. 
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Figure  5.  The  picture  used  to  accompany  the  simulation  of  the  McNamara  et  al.  (in  press)  data.  The  left-hand 
panel  illustrates  the  object  names  (letters),  the  order  in  which  the  names  were  presented  (digits),  and 
their  locations  (dots).  The  right-hand  panel  illustrates  the  map-like  stimulus  actually  available  to  the 
subjects. 
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In  the  experiment,  subjects  had  continuous  access  to  a  map  giving  the  locations  of  the  objects,  but  not  the 
object names, much like the right side of Figure 5. The names of the objects were given one at a time, by
presenting an object name next to its location. A critical variable was the order in which the names were
presented. The left side of Figure 5 uses lowercase letters to represent object names and Arabic numerals to
represent order of presentation of the names. Note that the right-hand side of the figure corresponds to what
subjects  saw  most  of  the  time;  subjects  never  saw  anything  corresponding  to  the  left-hand  side  of  the  figure. 

Pairs  of  objects  can  be  defined  on  the  basis  of  whether  the  objects  were  temporally  or  spatially  contiguous. 
Thus  Objects  a  and  b  are  both  temporally  and  spatially  contiguous,  whereas  Objects  c  and  d  are  spatially 
contiguous,  but  not  temporally  contiguous.  Objects  e  and  f  are  temporally  contiguous,  but  spatially  distant, 
whereas  Objects  g  and  h  are  temporally  and  spatially  distant.  These  pairs  were  then  used  as  primes  and  targets  on 
the  object-name  recognition  test.  For  example,  a  prime,  the  name  of  Object  a,  would  be  presented,  and  the  subject 
would  respond  "yes."  Next,  a  target,  the  name  of  Object  b,  would  be  presented.  The  question  of  interest  was 
how  the  spatial  and  temporal  relations  between  the  targets  and  the  primes  would  affect  speed  of  responding  to  the 
targets. 

The  response  times  to  the  targets  (collapsed  across  the  experiments  reported  by  McNamara  et  al.)  are  given 
on  the  left-hand  side  of  Figure  6.  Note  the  interaction:  responding  to  a  target  name  is  facilitated  by  a  prime  that 
was  spatially  and  temporally  close  during  acquisition,  but  not  when  the  relation  was  just  spatial  or  just  temporal. 
Data  from  the  simulation  are  presented  on  the  right-hand  side  of  Figure  6. 

To  simulate  these  data,  we  used  a  "picture"  that  provided  metric  information  about  the  locations.  When  an 
object  was  mentioned,  the  simulation  consulted  the  picture,  scaled  the  location  in  the  picture  to  the  dimensions  of 
the  mental  model,  and  entered  into  the  mental  model  a  pointer  representing  the  object.  Thus,  the  location  of  the 
pointer  in  the  mental  model  was  controlled  by  its  location  in  the  picture. 
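
The scaling step can be sketched as follows. This is a minimal illustration and not the code of the simulation itself: the proportional (linear) scaling rule, the names Pointer, scale_to_model, and enter_pointer, and the picture and model dimensions are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pointer:
    """A pointer in the mental model: which object it stands for and where it sits."""
    name: str
    x: float
    y: float


def scale_to_model(pic_xy, pic_size, model_size):
    """Proportionally rescale a picture coordinate to the mental model's dimensions.

    Linear scaling is an assumption; the text says only that the location is scaled.
    """
    (px, py), (pw, ph), (mw, mh) = pic_xy, pic_size, model_size
    return (px / pw * mw, py / ph * mh)


def enter_pointer(model, name, pic_xy, pic_size, model_size):
    """Consult the picture for `name`, scale its location, and enter a pointer."""
    x, y = scale_to_model(pic_xy, pic_size, model_size)
    model[name] = Pointer(name, x, y)
    return model[name]


# Hypothetical numbers: a 200 x 100 picture mapped onto a 10 x 10 mental model.
model = {}
print(enter_pointer(model, "a", (50, 25), (200, 100), (10, 10)))
```

Because the pointer's coordinates are computed only from the picture, any later process that depends on distances in the model inherits the metric structure of the picture.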

Whenever  the  mental  model  is  manipulated  (e.g.,  by  entering  a  new  pointer),  the  simulation  engages  in 
noticing. (Noticing also occurred in the simulation of the Glenberg et al., 1987, data, but discussion of it was
suppressed for clarity.) The idea of noticing is to use the structure of the mental model to encode relations that are
not given explicitly in the text. To notice, the simulation examines spatial locations within the "noticing radius" (a
free  parameter  in  the  simulation)  of  a  manipulated  pointer.  If  another  pointer  is  within  that  radius,  the  simulation 
encodes  the  relation  between  the  two  pointers,  and  stores  the  proposition  with  pointers  to  the  relevant  nodes. 
Unlike  other  computational  systems  for  generating  inferences,  this  one  is  self-limited  in  three  ways.  First, 
noticing  only  occurs  for  objects  represented  in  the  limited  capacity  mental  model.  Second,  noticing  only  occurs 
for  objects  within  the  noticing  radius  of  manipulated  pointers.  Third,  inferences  are  made  only  about  the  relations 
assigned  (for  the  current  text  domain)  to  the  spatial  dimensions. 
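
A minimal sketch of noticing, under stated assumptions, is given here. The function name notice, the use of Euclidean distance, the relation label "near", and the coordinates are hypothetical; the text specifies only that pointers within the noticing radius of a manipulated pointer have their relation encoded.

```python
import math


def notice(model, manipulated, radius, relation="near"):
    """Encode a relation between the manipulated pointer and each pointer in range.

    `model` maps object names to (x, y) locations of pointers currently resident
    in the mental model. Only those pointers are examined (first limit), only
    within `radius` of the manipulated pointer (second limit), and only one
    kind of relation, `relation`, is encoded (third limit).
    """
    mx, my = model[manipulated]
    propositions = []
    for name, (x, y) in model.items():
        if name == manipulated:
            continue
        if math.hypot(x - mx, y - my) <= radius:
            propositions.append((relation, manipulated, name))
    return propositions


# Hypothetical layout: a and b are neighbors, c is far from both.
model = {"a": (1.0, 1.0), "b": (2.0, 1.0), "c": (8.0, 8.0)}
print(notice(model, "b", radius=1.5))
print(notice(model, "c", radius=1.5))
```

The cost of each call grows with the number of resident pointers and not with the size of the text, which is one way to see why the process is self-limiting.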

Consider how the simulation responds to the presentation of object names in the McNamara et al.
experiment. When Object a is presented, a pointer is entered into the mental model, but there is nothing to notice.
When a pointer to the node for Object b is entered into the mental model, the relation to Object a is noticed and
stored. When the pointer to the node for Object c is entered into the model, no new relations are noticed, because
the pointers to Objects a and b are outside the noticing radius of the pointer to Object c. The next (fourth) pointer
entered into the model is the one to the node of Object g. With its entry, activation of the pointer to Object a is so
low that it is dropped from the model. Similarly, by the time the pointer to Object d is entered (seventh), the pointer
for Object c has been dropped. Thus, although two objects may be spatially close in a picture, if their pointers are
not concurrently resident in the mental model, no relation is noticed.
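
The interplay of limited capacity and noticing in this walk-through can be sketched as follows. Everything numerical here is an assumption chosen for illustration: the halving decay rule, the drop threshold, the noticing radius, and the coordinates. The text gives the presentation positions of Objects a, b, c, g, and d only, so the fifth and sixth items are placeholders named p5 and p6.

```python
import math


def present(order, locations, radius=1.5, decay=0.5, threshold=0.2):
    """Enter pointers one at a time into a limited-capacity model, noticing as we go.

    Returns the list of noticed (new, resident) pairs and the final model,
    which maps each resident pointer's name to its activation.
    """
    model = {}
    noticed = []
    for name in order:
        # Each new entry lowers the activation of resident pointers;
        # pointers that fall below the threshold are dropped.
        model = {n: a * decay for n, a in model.items() if a * decay >= threshold}
        x, y = locations[name]
        # Noticing is restricted to pointers still resident in the model.
        for other in model:
            ox, oy = locations[other]
            if math.hypot(x - ox, y - oy) <= radius:
                noticed.append((name, other))
        model[name] = 1.0
    return noticed, model


# Hypothetical coordinates: a-b and c-d are spatial neighbors.
locations = {
    "a": (1, 1), "b": (2, 1), "c": (8, 8), "g": (1, 9),
    "p5": (9, 1), "p6": (5, 5), "d": (8, 7),
}
order = ["a", "b", "c", "g", "p5", "p6", "d"]
noticed, model = present(order, locations)
print(noticed)        # only the a-b relation is noticed
print(sorted(model))  # pointers still resident after the seventh entry
```

With these settings a pointer survives two later entries and is dropped on the third, so the pointer to Object a is dropped when Object g is entered, and the pointer to Object c is gone before Object d arrives, even though c and d are neighbors in the picture.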

On the recognition test, responding to the name of an object requires retrieval of information about the
object.  That  retrieval  process  activates  various  propositions,  including  propositions  encoding  noticed  relations. 
For  example,  when  the  name  of  Object  a  (the  prime)  is  presented,  the  proposition  encoding  its  noticed  relation  to 
Object  b  is  activated.  This  activated  proposition  partially  activates  the  node  for  Object  b  (the  target),  producing 
facilitation  when  Object  b  is  presented  for  recognition.  On  the  other  hand,  responding  to  the  name  of  Object  c 
does  not  facilitate  responding  to  the  name  of  Object  d,  because  no  relation  was  noticed  between  these  objects. 
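
The priming account can be illustrated with a small sketch. The function name target_rt and all of its parameters (baseline time, amount of partial activation, and gain) are hypothetical values in arbitrary units; they are not fitted to the data and are not taken from the simulation.

```python
def target_rt(prime, target, noticed, base=100.0, boost=0.5, gain=20.0):
    """Return a hypothetical response time for `target` following `prime`.

    Retrieving the prime activates every stored proposition that mentions it.
    Each such proposition that also mentions the target partially activates
    the target's node, and response time falls in proportion to that
    pre-activation.
    """
    activation = 0.0
    for first, second in noticed:
        if {first, second} == {prime, target}:
            activation += boost
    return base - gain * activation


# Only the relation between Objects a and b was noticed during learning.
noticed = [("b", "a")]
print(target_rt("a", "b", noticed))  # facilitated
print(target_rt("c", "d", noticed))  # no noticed relation, so no facilitation
```

Under this sketch, facilitation depends on whether a relation was stored during learning and not on spatial proximity in the picture as such, which is the pattern the text describes.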

Figure 6. The data on the left are averages based on the experiments presented in McNamara et al. (in press).
The data on the right are from the simulation described in the text. (Figure title: McNamara, Halpin, and
Hardy Simulation.)

Thus, our simulation of the McNamara et al. experiments demonstrates one way in which pictures can
facilitate comprehension. In particular, a picture can provide metric information about objects to guide the
construction of mental models. Then, the mental model can be used to notice (encode) new relations, and thus
to learn more about the situation than is given by the text.